"Winner takes it all": strongest node rule for evolution of scale free networks 
We study a novel model for the evolution of complex networks. We introduce information filtering,
which reduces the number of available nodes to a randomly chosen sample, as the stochastic component
of evolution. New nodes are attached to the nodes that have maximal degree in the sample, which
is the deterministic component of the network evolution process. This feature is novel for the
evolution of scale-free networks and suggests a possible new route for modeling network growth.
We present both simulation and theoretical results for network evolution. The obtained degree
distributions exhibit clear power-law behavior in the middle with an exponential cut off at the end.
This highlights the essential role of information filtering in network growth mechanisms.
I. INTRODUCTION



Recently, there have been a number of extensive
investigations in the field of complex networks, and many
important theoretical and practical results have been
reported. Many real-world systems can be described as
complex networks: the WWW, Internet routers, proteins, and
scientific collaborations, among others. The main features
that separate complex networks from "ordinary" networks are
the famous small-world effect and the scale-free degree
distribution.

The first and simplest model for the scale-free distribution
of degrees in complex networks was proposed by Albert and
Barabasi (hereafter referred to as the AB model). This model
is based on a simple principle of preferential attachment.
The network grows in such a way that at each time step $t$ a
new node is introduced into the network and attaches itself
to some of the older nodes, each designated by the moment $s$
at which it entered the network. The probability that the
node $t$ will attach itself to a node $s$ is linearly
proportional to the degree $k_s$ of the older node,
$p_{t \to s} \propto k_s$. Using this simple principle a
scale-free network with exponent 3 is easily reproduced.
Although very appealing because of its simplicity, the AB
model cannot correctly reproduce all characteristics of
real-world networks. First, it produces a temporally
correlated network in the sense that older nodes tend to have
more edges than younger ones, which was not observed in real
data. Second, it assumes that every new node has complete
information about the whole network, which is unrealistic for
real network formation [14, 15]. Third, in its original form,
it reproduces only networks with a degree distribution
characterized by the exponent $\gamma = 3$. Nevertheless, the
AB model has triggered a huge number of models that try to
avoid these shortcomings while remaining natural extensions
of the original. Among others, there are models with
nonlinear preferentiality, with rewiring of edges at later
times [17], and with a fitness parameter as an intrinsic
value of a node. Although novel and more complex approaches,
which describe a variety of degree distributions and have
more support in real data, have been studied recently
[23, 21], we believe that it is also of fundamental
importance to examine "as simple as possible" processes that
capture the essential behavior of real-world networks.

In this paper we present a novel model which exhibits 
power-law-like degree distribution of an undirected net- 
work or the in-degree power-law-like distribution of a di- 
rected network. The purpose of the model is to test infor- 
mation filtering as a stochastic component of the network 
evolution process, while using a simple deterministic rule 
for attachment of new nodes. The results we report in this
article clearly show that our model can reproduce not only
power-law distributions but also power laws with a cut off,
similar to some real data reported recently.



II. MODEL 

Our model introduces two crucial features that make it
different from the Albert-Barabasi model. A new node is
introduced into the network at each time step. For simulation
purposes, we first generate a core of 1100 nodes that are
randomly connected to each other: each new node in the core
is connected to one of the older ones with uniform
probability, until the core is formed. The size of the core
is taken to be 1100 because we chose to monitor filtration
subsets (samples) of up to 1000 nodes. After the core is
formed, the following procedure takes place. Each new node
attaches itself to the network with $\omega$ links. To choose
the already present nodes to which it will attach, the
following rule is applied. i) A sample of fixed size $m$ is
randomly chosen from the $t$ nodes already present in the
network. The probability that any given node is chosen for
the sample equals $m/t$. ii) The chosen nodes are sorted by
their degree in decreasing order. No additional rearrangement
is applied to nodes with the same degree. iii) The new node
is attached to the first $\omega$ nodes of the sorted sample,
i.e. those with the highest degree. The third rule is a
simple deterministic "winner takes it all" algorithm, which,
combined with the first two rules, produces very interesting
macroscopic effects, as will be presented in this paper.

FIG. 1: Simulated cumulative probability functions with
$m = 100$ and $\omega = 1$ for different final network sizes
$n_{max}$ are compared to the theoretically obtained one. The
figure clearly depicts the asymptotic approach of the
simulation curves to the theoretical result. This implies
that the analytical results are precise and describe the
behavior of the system sufficiently well when
$n_{max} \to \infty$. The inset gives an enlarged section
with the tails of the simulated cumulative probability
distributions to better illustrate the effects of the finite
network size.

The nodes are numbered from 0, and the network is grown to
the size $n_{max}$. We averaged over 100 simulations for
every investigated $\omega$, $m$, and $n_{max}$ in order to
get a statistically relevant ensemble of network
realizations. We also performed a scaling investigation,
presented in Fig. 1, to see how the simulated distribution
behaves for different network sizes.
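
The growth rule of this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code used for the results reported here: the function name `grow_network`, its parameters, and the default core size of `m + 100` (chosen only to mirror the 1100-node core used for samples of up to 1000 nodes) are hypothetical.

```python
import random


def grow_network(n_max, m, omega=1, core_size=None, seed=None):
    """Grow a network with the sample-and-pick-strongest rule.

    Returns a list with the degree of every node 0 .. n_max - 1.
    """
    rng = random.Random(seed)
    if core_size is None:
        core_size = m + 100  # assumed default, not prescribed by the text
    if core_size < max(m, 2) or n_max < core_size:
        raise ValueError("need max(m, 2) <= core_size <= n_max")

    degree = [0] * n_max
    # Core: each new core node links to one uniformly chosen older node.
    for t in range(1, core_size):
        s = rng.randrange(t)
        degree[t] += 1
        degree[s] += 1

    # Growth: node t samples m of the t existing nodes (rule i), sorts them
    # by degree (rule ii) and links to the omega strongest ones (rule iii).
    for t in range(core_size, n_max):
        sample = rng.sample(range(t), m)
        # sorted() is stable, so equal-degree nodes keep their random
        # sample order, i.e. no additional rearrangement is applied.
        winners = sorted(sample, key=lambda s: degree[s], reverse=True)[:omega]
        for s in winners:
            degree[s] += 1
        degree[t] += omega
    return degree
```

A degree histogram built from the returned list, averaged over independent runs with different seeds, gives an estimate of the degree distribution studied below.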
III. THEORY

In the theoretical treatment of the node degree distribution
we limit ourselves to the description of networks with
$\omega = 1$. The reason is that the analytical study of the
case $\omega > 1$ is cumbersome: it would include many more
summation terms that are analytically almost impossible to
disentangle. We use the master equation approach of
Dorogovtsev and Mendes [23]. In this approach a new node
enters the network at every moment $s$ and is therefore
denoted by $s$. It connects with one edge to the node with
maximum degree in a randomly selected sample of size $m$. The
nodes in the sample are selected from the $t$ nodes that are
already present in the network, so that every existing node
has the probability $m/t$ of entering the sample.

The probability $v(k, m, t-1)$ that the node $s$ with degree
$k$ will enter the sample of size $m$ at time $t$ and will be
chosen for the attachment of the new node is given by
Eq. (1). Here the first binomial coefficient in the numerator
of Eq. (1) represents the number of possible ways to choose
$l$ nodes with degree smaller than $k$ into the sample, and

$$B(k, t-1) = \sum_{q=1}^{k-1} N(q, t-1),$$   (2)

where $N(k,t)$ is the number of nodes with degree $k$ at time
$t$. The second binomial coefficient counts the number of
possible ways to choose $m - 1 - l$ nodes with the same
degree as node $s$ into the sample. This part of expression
(1) accounts for the possibility that the selected sample
contains other nodes with the same maximal degree as $s$.
Using the fact that $N(q,t) = P(q,t) \cdot t$, together with
the approximation that for large $t$ one can replace
$\binom{t}{m}$ with $t^m/m!$, we reduce the expression to the
form of Eq. (3), where

$$\Pi(k, t-1) = \sum_{q=1}^{k-1} P(q, t-1).$$   (4)
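
The counting argument of this section can be checked numerically. The sketch below is one reading of that argument, not a transcription of the printed equations: it assumes that ties among sampled nodes of equal maximal degree are broken uniformly, so that a node sharing the maximum with $j - 1$ other sampled nodes wins with probability $1/j$. The function names and arguments are hypothetical.

```python
from math import comb


def selection_probability(n_lower, n_equal, t, m):
    """Probability that one given node of degree k receives the new link.

    n_lower: existing nodes with degree smaller than k
    n_equal: existing nodes with degree exactly k (the node itself included)
    t: number of existing nodes, m: sample size
    """
    total = comb(t, m)
    prob = 0.0
    for l in range(m):
        ties = m - l  # sampled nodes of degree k, the given node included
        ways = comb(n_lower, l) * comb(n_equal - 1, ties - 1)
        # Assumed uniform tie-breaking: the node wins with probability 1/ties.
        prob += ways / (ties * total)
    return prob


def large_t_approximation(pi_k, p_k, t, m):
    """Same probability with every binomial replaced by its leading power."""
    c = sum(comb(m, l) * pi_k**l * p_k**(m - 1 - l) for l in range(m))
    return c / t
```

For a network in which all $t$ nodes have the same degree, `selection_probability(0, t, t, m)` returns $1/t$, as expected by symmetry, and summing the probability over all nodes of any degree sequence gives 1.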

Using the well-established Dorogovtsev-Mendes master equation
approach for calculating the node degree distribution, for
$k \ge 2$ we write

$$p(k,s,t) = v(k-1,m,t-1)\, p(k-1,s,t-1)
  + \left(1 - v(k,m,t-1)\right) p(k,s,t-1).$$   (5)

To calculate the probability distribution $P(k,t)$ that a
randomly chosen node has $k$ edges at time $t$, we average
the probability distributions of all nodes $s$, i.e.

$$P(k,t) = \frac{1}{t+1} \sum_{s=0}^{t} p(k,s,t).$$   (6)

Thus we obtain Eq. (7). Assuming that Eq. (7) has a stable
asymptotic solution for $t \to \infty$, so that the
time-dependent probability distribution becomes time
independent, $P(k,t) = P(k)$, we obtain the following closed
form:

$$P(k) = C(k-1)P(k-1) - C(k)P(k).$$   (9)

Equations (9) are polynomials of order $m$ in $P(k)$ and hold
for all $k \ge 2$. Written as polynomials, they adopt the
following form:

$$a(0)P(k)^{m} + a(1)P(k)^{m-1} + \dots + a(l)P(k)^{m-l}
  + \dots + (1 + a(m-1))P(k) = C(k-1)P(k-1),$$   (10)

where the coefficients $a(l)$ are given by Eq. (11).

For the theoretical treatment of $P(1)$ as our boundary
condition the following equation holds:

$$p(1,s,t) = \delta_{s,t}
  + (1 - \delta_{s,t}) \left(1 - v(1,m,t-1)\right) p(1,s,t-1),$$   (12)

with an obvious relation for the probability that a node with
one edge at time $t-1$ will acquire a new edge at time $t$:

$$v(1,m,t-1) = \frac{P(1,t-1)^{m-1}}{t}.$$   (13)

Using a procedure similar to that described above, we obtain
the asymptotic value for $P(1)$:

$$P(1) = 1 - P(1)^{m}.$$   (14)
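
Equation (14), read as $P(1) = 1 - P(1)^m$, has a single root between 0 and 1, because $x + x^m$ increases monotonically from 0 to 2 on that interval. The sketch below finds it by bisection. The exponent $m$ is our reading of the boundary equation, and the function name `solve_p1` is hypothetical.

```python
def solve_p1(m, tol=1e-14):
    """Root of x + x**m = 1 on (0, 1), found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + mid**m < 1.0:
            lo = mid  # left-hand side too small: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Under this reading, larger samples push $P(1)$ closer to 1, in line with the dependence of the jump at $k = 1$ on $m$ discussed in Sec. IV.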



FIG. 2: The theoretical probability distribution (solid line)
closely follows the simulation data (black diamonds) for
$m = 1000$. Scattering in the tail is a consequence of
low-probability fluctuations induced by finite size effects.
The reader should also note the big jump in probability at
$P(k = 1)$.



Unfortunately, the set of Eqs. (10) and (14) cannot be solved
analytically and is therefore solved numerically. The
solutions of these polynomial equations show excellent
agreement with numerical simulations, as can be seen in
Figs. 2, 3, 4 and 5. These findings further support the
master equation approach followed in this paper.



IV. DISCUSSION 

As mentioned in the preceding section, the master equation
approach yields a chain of polynomial equations, (10) and
(14). Note that $P(k^*)$, the probability that a randomly
chosen node has degree $k^*$, depends only on the
probabilities of degrees less than or equal to $k^*$. We have
calculated the roots of the system to obtain the degree
probability distribution.
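
Because each equation in the chain involves only lower degrees, the roots can be found one degree at a time. The sketch below illustrates this under an explicit assumption that is not quoted from the text: it takes $C(k)P(k) = (\Pi(k) + P(k))^m - \Pi(k)^m$, the large-$t$ probability that the highest degree in a sample of $m$ nodes equals $k$, with $\Pi(k)$ the total probability of degrees below $k$. The function name and the bisection solver are likewise choices made for the example.

```python
def degree_distribution(m, k_max):
    """Return [P(1), ..., P(k_max)] for sample size m (assumed closed form)."""

    def bisect(f, hi):
        # f is increasing in p, negative at 0 and non-negative at hi.
        lo = 0.0
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if f(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    probs = []
    cum_below = 0.0  # Pi(k): probability of degrees smaller than k
    inflow = 1.0     # right-hand side; equals 1 for the boundary k = 1
    for _ in range(1, k_max + 1):
        pi_k, rhs = cum_below, inflow
        # Stationary balance: P(k) + C(k) P(k) = C(k-1) P(k-1).
        f = lambda p, pi_k=pi_k, rhs=rhs: p + (pi_k + p)**m - pi_k**m - rhs
        p_k = bisect(f, max(1.0 - pi_k, 0.0))
        probs.append(p_k)
        inflow = (pi_k + p_k)**m - pi_k**m  # C(k) P(k) feeds degree k + 1
        cum_below += p_k
    return probs
```

With this closed form the probabilities sum to 1 and the mean degree equals 2, as required for a network in which every new node brings exactly one edge.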

All simulated data and analytical roots of the polynomial
equations exhibit a big jump from $P(k = 1)$ to $P(k = 2)$ of
an order of magnitude or more. The difference
$P(k = 1) - P(k = 2)$ depends strongly on the size $m$ of the
chosen sample. If the sample is larger, there is a higher
probability that a node of degree larger than 1 will enter
the sample and collect the new link. The smaller the sample,
the greater the probability that only nodes of degree one
will be chosen in the sample, so that one of them collects
the link, thus lowering the overall number of nodes of degree
one. The analytical solutions obtained from Eq. (14) are in
excellent agreement with the simulation results regarding
this jump. The average relative error between simulation and
theory for $m \in \{10, 100, 1000\}$ is
4.3 · 10^[exponent illegible], and gets smaller as the sample
size $m$ grows, for $n_{max}$ = 10^[exponent illegible].

FIG. 3: For $m = 10$ the theoretical distribution (solid
line) closely follows the simulation data (black dots). The
disagreement in the very tail is explained by finite size
effects in the simulated data. However, the FS theoretical
distribution obtained by transformation (16) shows even
better agreement with the simulation.

FIG. 4: For $m = 100$ it is easy to see that the theoretical
distribution follows the simulation data very well, and the
FS theoretical distribution even better.

FIG. 5: $m = 1000$ is the largest monitored sample size but
is still small enough compared to the number of simulated
nodes, $n_{max}$ = 10^[exponent illegible]. Theory is in
excellent agreement with simulation.

FIG. 6: Two different functions, i) a stretched exponential
and ii) a power law with a cut off, are fitted to the
theoretical data for sample size $m = 10$. This figure
clearly shows that the power law with the exponential cut off
better describes the tail of the theoretical distribution.

All simulated data exhibit strong scattering in the tail. The
scattering is a consequence of low-probability fluctuations
and makes the comparison between theory and simulation more
difficult, Fig. 2. In order to smooth the data and compare
theory and simulation, it is possible to use exponential
binning or to transform the probability distribution into the
cumulative probability distribution. We implemented the
second approach and produced a cumulative degree probability
distribution $P_{cum}$:

$$P_{cum}(k) = \sum_{q=k}^{\infty} P(q).$$   (15)

This distribution contains the same information about the
system as the degree distribution, but is much smoother in
the tail. We compared our theoretical curve with the
simulated one and found an excellent match between theory and
simulations. The results of the comparison between simulation
and theory are presented in Figs. 3, 4 and 5. The relative
disagreement observed in the tails is a consequence of finite
size effects, Fig. 1. Since our theoretical curve falls off
relatively slowly, as can be seen in Table I, the summation
of probabilities for $k > k_{max}$ in Eq. (15) contributes
strongly to the cumulative degree probability in the tail. To
get an even better match, we calculated the "renormalized"
cumulative probability distribution of Eq. (16). This finite
size cumulative probability distribution (hereafter denoted
as the FS theoretical distribution) is even better in
describing finite size effects, as shown in Figs. 3, 4 and 5.
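
The cumulative transformation is simple to compute from a list of degree probabilities. The sketch below shows it together with one plausible reading of the "renormalized" finite size variant, namely truncating the sum at $k_{max}$ and rescaling so that the curve starts at 1. That reading is an assumption of the example, as are the function names.

```python
def cumulative_distribution(probs):
    """P_cum(k) = sum of P(q) for q >= k, with probs = [P(1), P(2), ...]."""
    cum = []
    running = 0.0
    for p in reversed(probs):
        running += p
        cum.append(running)
    cum.reverse()
    return cum


def renormalized_cumulative(probs, k_max):
    """Assumed finite size variant: drop degrees above k_max, then rescale."""
    cum = cumulative_distribution(probs[:k_max])
    return [c / cum[0] for c in cum]
```

Summing from the tail inwards keeps the small tail probabilities from being lost against the large value of $P(1)$.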

To obtain a description of the degree distribution in the
thermodynamic limit, we fitted the theoretical cumulative
degree distribution (the theoretical rather than the
simulated distribution was used since it does not suffer from
finite size effects) with the stretched exponential,
Eq. (17), and the power-law distribution with the exponential
cut off, Eq. (18).

FIG. 7: Fitted curves for $m = 100$. The power law with the
exponential cut off represents the theoretical distribution
very well.

FIG. 8: Excellent agreement of the fitted and theoretical
distributions for $m = 1000$.



For fitting purposes we used all theoretical $P_{cum}(k)$
values except $P_{cum}(1)$, because its value is clearly not
determined by the scale-free-like behavior, as opposed to all
other $k$ values. Both distributions fit our overall results
very well, as presented in Table I and Figs. 6, 7 and 8. The
correlation coefficients of the fitted distributions are all
above 0.99, showing that both fitting models are capable of
describing the theoretically obtained curves very well. The
power law with the exponential cut off always has a slightly
higher correlation coefficient than the stretched exponential
for the same sample size $m$. The figures clearly show that
the reason for this behavior is the much better description
of the tail by the power law with the exponential cut off.
The stretched exponential is clearly not suitable for the
description of the tail properties.
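
A fit of this kind can be sketched without external libraries. The example below assumes the functional form $A\,k^{-\gamma} e^{-a k}$ for the power law with the exponential cut off and fits it by linear least squares on the logarithm of the data. Both the symbols $A$ and $\gamma$ and the log-space fitting procedure are choices made for this illustration and are not necessarily those used to produce Table I.

```python
import math


def fit_power_law_cutoff(ks, values):
    """Fit ln y = ln A - gamma * ln k - a * k; return (A, gamma, a)."""
    rows = [(1.0, -math.log(k), -float(k)) for k in ks]
    targets = [math.log(y) for y in values]
    n = 3
    # Normal equations (X^T X) beta = X^T y for beta = (ln A, gamma, a).
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(n)]
    aug = [ata[i] + [atb[i]] for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    beta = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(aug[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (aug[i][n] - rest) / aug[i][i]
    return math.exp(beta[0]), beta[1], beta[2]
```

In keeping with the procedure described above, the value at $k = 1$ would be left out of `ks` and `values` before fitting.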

It is worth mentioning that the power-law distribution with
the exponential cut off has already been obtained in a
similar model [15], which has shown that the exponential
parameter $a$ is trivially connected with the sample size $m$
by the relation $a = 1/m$. Although one cannot expect this
relation to be valid for this model as well, the parameter
$a$ is very close to $1/m$, and the agreement is better for
larger $m$, as can be seen in Table I. In our opinion, it
would be interesting to measure $a$ in some observed network
distributions of a similar shape and compare it with the
expected sizes of the samples within which a new node has the
possibility of creating a link.

Finally, let us briefly discuss simulation results for
$\omega > 1$. Simulation results for the cumulative
probability distribution (without $P(\omega)$) are displayed
in Fig. 9.
          Power law with exponential cut off    Stretched exponential
  m       par. 1     a          corr            par. 1     par. 2     corr
  10      0.5501     0.0765     0.9980          0.9829     0.4718     0.9976
  100     0.4026     0.0092     0.9995          0.3894     0.4385     0.9987
  1000    0.3067     0.0011     0.9994          0.2230     0.3823     0.9978

TABLE I: Fitted distribution parameters for different sample
sizes. Correlation coefficients show excellent agreement
between the theoretical distribution data and the presented
fits.



The typical characteristics of the distribution are
equivalent to the $\omega = 1$ case. The degree $k = \omega$
has a substantially larger probability than all other
degrees. The cumulative probability distribution for $k > 2$
obtained in the simulations displays scale-free-like
properties. These simulated distributions can be well fitted
with the power law with the exponential cut off (18), as
shown in Fig. 9.




FIG. 9: Evidence that the distributions for $\omega > 1$ fall
in the same class as the distributions studied analytically.
The case $m = 100$ is presented.
V. CONCLUSION

We have shown that a simple "winner takes it all" algorithm,
together with the fact that nodes do not possess complete
information on the network structure, creates a macroscopic
node-degree power law. The evolution of real networks is
still an open question, and we have shown that realistic
imperfect knowledge can have a substantial effect on network
growth. Although the field of complex networks has made great
progress during the last few years, there is still much room
for research on microscopic models that describe the
formation of complex networks with certain expected features.
Our results clearly show that stochastic-deterministic
processes, even ones as simple as that described in this
paper, can be used to reproduce some macroscopic effects of
complex networks. Moreover, in this paper as well as in
earlier work we have demonstrated that the power law with the
exponential cut off can be a significant distribution for
types of networks in which information filtering is
performed. New findings in social contact networks [22] lead
us to believe that the power law with the exponential cut off
and stretched exponentials should be studied more intensively
in the future.